The Best Templates Match Technique For Example Based Machine Translation
نویسندگان
چکیده
It has been proved that large-scale realistic Knowledge Based Machine Translation (KBMT) applications require acquisition of huge knowledge about language and about the world. This knowledge is encoded in computational grammars, lexicons and domain models. Another approach – which avoids the need for collecting and analyzing massive knowledge-is the Example Based approach, which is the topic of this paper. We show through the paper that using Example Based in its native form is not suitable for translating into Arabic. Therefore a modification to the basic approach is presented to improve the accuracy of the translation process. The basic idea of the new approach is to improve the technique by which template-based approaches select the appropriate templates. It relies on extracting, from a parallel Bilingual Corpus, all possible templates that could match parts of the source sentence. These templates are selected as suitable candidate chunks for the source sentence. The corresponding Arabic templates are also extracted and represented by a diredted graph. Each branch represents one possible string of templates candidate to represent the target sentence. The shortest continuous path or the most probable tree branch is selected to represent the target sentence. Finally the Arabic translation of the selected tree branch is generated.
منابع مشابه
Learning Translation Templates From Bilingual Text
This paper proposes a two-phase example-based machine translation methodology which develops translation templates from examples and then translates using template matching. This method improves translation quality and facilitates customization of machine translation systems. This paper focuses on the automatic learning of translation templates. A translation template is a bilingual pair of sen...
متن کاملAutomatic Determination of Number of clusters for creating templates in Example-Based Machine Translation
Example-Based Machine Translation (EBMT), like other corpus based methods, requires substantial parallel training data. One way to reduce data requirements and improve translation quality is to generalize parts of the parallel corpus into translation templates. This automated generalization process requires clustering. In most clustering approaches the optimal number of clusters (N ) is found e...
متن کاملTemplate Extraction for a Bidirectional English-Filipino Machine Translation System
A bidirectional English-Filipino Example-based Machine Translation System that learns and uses templates is presented. The system uses machine learning techniques to initially extract templates from a given bilingual corpus. These templates are subsequently used for translating English input text into Filipino and vice versa. The system implements the similarity template learning algorithm perf...
متن کاملInducing Translation Templates for Example-Based Machine Translation
This paper describes an example-based machine translation (EBMT) system which relays on various knowledge resources. Morphologic analyses abstract the surface forms of the languages to be translated. A shallow syntactic rule formalism is used to percolate features in derivation trees. Translation examples serve the decomposition of the text to be translated and determine the transfer of lexical...
متن کاملAutomatically Extracting Templates from Examples for NLP Tasks
In this paper, we present the approaches used by our NLP systems to automatically extract templates for example-based machine translation and pun generation. Our translation system is able to extract an average of 73.25% correct translation templates, resulting in a translation quality that has a low word error rate of 18% when the test document contains sentence patterns matching the training ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1406.1241 شماره
صفحات -
تاریخ انتشار 2014